Similarity/dissimilarity analysis of protein sequences using the spatial median as a descriptor

نویسنده

  • Mervat M. Abo-Elkhier
چکیده

A novel 3-D graphical representation of protein sequence has been introduced. A right cone of a unit base and unit height has been selected to represent protein sequences on its surface. The twenty amino acids have been represented by 20 circles and all protein’s residues have been represented by n lines on the cone’s surface. All the spots which represent the protein’s residues have been shown in the cone’s top view. The spatial median of all the spots is used as a new descriptor of any protein sequence. This approach was applied on two short segments of protein of yeast Saccharomyces cerevisiae. The examination of the similarities/dissimilarities for the eight ND5 proteins and the six β-globin proteins illustrate the utility of our approach. A linear correlation and significance analysis have been provided to compare our results and the percentage sequence alignment identity.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Analysis of similarity/dissimilarity of DNA sequences based on adjacent nucleotide pair representation

Introduction of graphic representation for nucleotide or protein sequences can provide intuitive overall pictures as well as useful insights for performing large-scale similarity analysis. In this paper, we are analyzing the similarity/dissimilarity of the mitochondrial genome sequences from twenty four mammal species. The analysis is important in finding the relatedness among the species and e...

متن کامل

Comparative genomics of human stem cell factor (SCF)

Stem cell factor (SCF) is a critical protein with key roles in the cell such as hematopoiesis, gametogenesis and melanogenesis. In the present study a comparative analysis on nucleotide sequences of SCF was performed in Humanoids using bioinformatics tools including NCBI-BLAST, MEGA6, and JBrowse. Our analysis of nucleotide sequences to find closely evolved organisms with high similarity by NCB...

متن کامل

Phylogenetic Analysis of Three Long Non-coding RNA Genes: AK082072, AK043754 and AK082467

Now, it is clear that protein is just one of the most functional products produced by the eukaryotic genome. Indeed, a major part of the human genome is transcribed to non-coding sequences than to the coding sequence of the protein. In this study, we selected three long non-coding RNAs namely AK082072, AK043754 and AK082467 which show brain expression and local region conservation among vertebr...

متن کامل

In Silico Characterization of Proteins Containing ARID-PHD Domain and Its Expression in Aeluropus littoralis Halophyte

Abiotic stresses are the most important factors that reduce the yield of crops. In this case, Bioinformatics analysis plays an important role to study genes, and their relatedness as well as prediction their function in response to abiotic stresses. Among all domains, ARID-PHD domain has been identified in plants and animals and has a very significant role in growth regulation, cell cycle, and ...

متن کامل

A Novel Image Structural Similarity Index Considering Image Content Detectability Using Maximally Stable Extremal Region Descriptor

The image content detectability and image structure preservation are closely related concepts with undeniable role in image quality assessment. However, the most attention of image quality studies has been paid to image structure evaluation, few of them focused on image content detectability. Examining the image structure was firstly introduced and assessed in Structural SIMilarity (SSIM) measu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013